
37 research outputs found

Počítačový fond češtiny

Author: Pala Karel
Publication venue: 'CSTUG'
Publication date: 01/01/1991

Institute of Mathematics AS CR, v. v. i.

Overview and Future of Czech Wordnet

Author: Pala Karel
Rambousek Adam
Tukačová Sandra
Publication venue: CEUR-WS.org
Publication date: 01/01/2017

Czech Wordnet is one of the national wordnets created during the EuroWordNet and BalkaNet projects. However, the data contains various issues that affect the use of Czech Wordnet in NLP applications. Due to a lack of resources, it has not been possible to update Czech Wordnet thoroughly since the publication of the first version. In 2017, we started a project to evaluate and update Czech Wordnet, followed by its connection to the Collaborative Interlingual Index. This paper provides an overview of various updates and extensions of the Czech Wordnet data and presents the roadmap to publish a revised version of Czech Wordnet under an open license.

Univerzitní repozitář Masarykovy univerzity

Sustainable long-term WordNet development and maintenance: Case study of the Czech WordNet

Author: Adam Rambousek
Aleš Horák
Karel Pala
Publication venue: 'Institute of Slavic Studies Polish Academy of Sciences'
Publication date: 01/01/2018

Czech WordNet is one of the first national wordnets created during the EuroWordNet and BalkaNet projects. However, the data contains various issues that affect the use of Czech WordNet in NLP applications. Since the publication of the first CzWN version, the semantic network has been augmented in several phases; however, the complex final editing and publishing process has not been finished. In 2017, we started a project to evaluate and update the Czech WordNet, followed by a connection to the Collaborative Interlingual Index. In this paper, we provide an overview of Czech WordNet data updates and extensions, and present the roadmap to publish a revised version of the Czech WordNet under an open license. Moreover, we introduce a concept developed for long-term updates and maintenance of the data based on crowdsourcing activities.

Crossref

Biblioteka Nauki - repozytorium artykułów

Directory of Open Access Journals

Univerzitní student a český jazyk

Author: Dědková Magda
Jelínek Milan
Pala Karel
Vojtová Jarmila
Publication venue: Masarykova univerzita
Publication date: 08/04/2014

In Universitas 3/2012, Messrs. Spousta and Šmarda, in their opening contribution on the topic "Univerzitní student a český jazyk" (The university student and the Czech language), invited the journal's readers to discuss this issue. Several readers sent in their experiences and observations, and their contributions are printed in the section Naše diskuse (Our Discussion).

Masaryk University Journals / Časopisy Masarykovy univerzity

Comparison of prognostic models to predict the occurrence of colorectal cancer in asymptomatic individuals: a systematic literature review and external validation in the EPIC and UK Biobank prospective cohort studies

Objective: To systematically identify and validate published colorectal cancer risk prediction models that do not require invasive testing in two large population-based prospective cohorts. Design: Models were identified through an update of a published systematic review and validated in the European Prospective Investigation into Cancer and Nutrition (EPIC) and the UK Biobank. The performance of the models to predict the occurrence of colorectal cancer within 5 or 10 years after study enrolment was assessed by discrimination (C-statistic) and calibration (plots of observed vs predicted probability). Results: The systematic review and its update identified 16 models from 8 publications (8 colorectal, 5 colon and 3 rectal). The number of participants included in each model validation ranged from 41 587 to 396 515, and the number of cases ranged from 115 to 1781. Eligible and ineligible participants across the models were largely comparable. Calibration of the models, where assessable, was very good and further improved by recalibration. The C-statistics of the models were largely similar between validation cohorts with the highest values achieved being 0.70 (95% CI 0.68 to 0.72) in the UK Biobank and 0.71 (95% CI 0.67 to 0.74) in EPIC. Conclusion: Several of these non-invasive models exhibited good calibration and discrimination within both external validation populations and are therefore potentially suitable candidates for the facilitation of risk stratification in population-based colorectal screening programmes. Future work should both evaluate this potential, through modelling and impact studies, and ascertain if further enhancement in their performance can be obtained

Diposit Digital de la Universitat de Barcelona

Automatic syntactic analysis of Czech text. (An experiment)

Author: Pala Karel
Publication venue: Institute of Information Theory and Automation AS CR
Publication date: 01/01/1968

Institute of Mathematics AS CR, v. v. i.